This article details a hands-on approach to modeling rare events in time series data using Python. It covers data exploration, defining extreme events, fitting distributions (GEV, Weibull, Gumbel), and evaluating model performance using metrics like log-likelihood, AIC, and BIC. The example uses weather data and provides code snippets for implementation.
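The fit-and-compare step described above can be sketched with SciPy. This is only an illustration under stated assumptions: the data is synthetic rather than the article's weather data, extremes are defined as annual block maxima, and the helper `fit_and_score` is a hypothetical name, not taken from the article.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Assumption: synthetic stand-in for a weather series, 30 "years" of 365 daily values.
daily = rng.gamma(shape=2.0, scale=5.0, size=(30, 365))

# Assumption: an extreme event is the block maximum, here the largest value per year.
annual_max = daily.max(axis=1)

candidates = {
    "GEV": stats.genextreme,
    "Gumbel": stats.gumbel_r,
    "Weibull": stats.weibull_min,
}


def fit_and_score(dist, data):
    """Fit by maximum likelihood and return log-likelihood, AIC, and BIC."""
    params = dist.fit(data)
    loglik = float(np.sum(dist.logpdf(data, *params)))
    k = len(params)  # number of fitted parameters
    n = len(data)    # number of observations
    aic = 2 * k - 2 * loglik
    bic = k * np.log(n) - 2 * loglik
    return {"params": params, "loglik": loglik, "aic": aic, "bic": bic}


results = {name: fit_and_score(dist, annual_max) for name, dist in candidates.items()}
for name, r in results.items():
    print(f"{name}: loglik={r['loglik']:.2f} AIC={r['aic']:.2f} BIC={r['bic']:.2f}")
```

Lower AIC and BIC indicate a better trade-off between fit and parameter count; BIC penalizes the extra shape parameter of the GEV and Weibull more heavily than AIC once the sample has more than about eight observations.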
This article explores the impact of hyperparameters on random forests, both in terms of performance and visual representation. It compares the performance of a default random forest with tuned decision trees and examines the effects of various hyperparameters like `n_estimators`, `max_depth`, and `ccp_alpha` using visualizations of individual trees, predictions, and errors.
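A comparison of this kind can be sketched with scikit-learn. The dataset, the regression setting, the specific hyperparameter values, and the choice of mean absolute error are all assumptions made for illustration; the article's own data and settings may differ.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Assumption: a synthetic regression problem stands in for the article's data.
X, y = make_regression(n_samples=400, n_features=8, noise=15.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Each entry changes one hyperparameter relative to the defaults.
settings = {
    "default": {},
    "few trees": {"n_estimators": 5},
    "shallow": {"max_depth": 3},
    "pruned": {"ccp_alpha": 50.0},
}

models = {}
scores = {}
for label, params in settings.items():
    model = RandomForestRegressor(random_state=0, **params)
    model.fit(X_train, y_train)
    models[label] = model
    scores[label] = mean_absolute_error(y_test, model.predict(X_test))

for label, mae in scores.items():
    depths = [tree.get_depth() for tree in models[label].estimators_]
    print(f"{label}: MAE={mae:.2f}, trees={len(depths)}, max depth={max(depths)}")
```

Printing the tree count and depth next to the error makes it easier to connect each hyperparameter to the structure of the fitted trees, which is the link the article draws visually.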
Learn how to connect several essential tools to develop a simple yet intuitive dashboard using Streamlit, Plotly, DuckDB, and Pandas to visualize data from a JSON file.
This article explores gamma spectroscopy using a Radiacode 103G detector and Python, detailing data collection, analysis, and experiments with various objects to identify radioactive elements.
ASCVIT V1 aims to make data analysis easier by automating statistical calculations, visualizations, and interpretations.
It includes descriptive statistics, hypothesis tests, regression, time series analysis, clustering, and LLM-powered data interpretation.
- Data input and overview: accepts CSV or Excel files and provides summary statistics, variable types, and data points.
- Visualizations: histograms, boxplots, pairplots, correlation matrices.
- Hypothesis tests: t-tests, ANOVA, chi-square test.
- Regression: linear, logistic, and multivariate regression.
- Time series analysis.
- Clustering: k-means, hierarchical clustering, DBSCAN.
Integrates with an LLM (large language model) via Ollama for automated interpretation of statistical results.
This article demonstrates how to use Pandas' built-in plotting methods for common data visualization tasks, suggesting that they can be sufficient for routine EDA without writing Matplotlib code directly (Pandas uses Matplotlib as its default plotting backend).
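As a rough illustration of that idea, the sketch below produces three routine EDA charts through the `DataFrame.plot` accessor alone. The column names and data are made up for the example, and Matplotlib still has to be installed because Pandas draws through it by default.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Assumption: a small synthetic dataset with hypothetical column names.
df = pd.DataFrame(
    {
        "price": rng.normal(100, 15, 200),
        "quantity": rng.integers(1, 20, 200),
        "region": rng.choice(["north", "south", "east"], 200),
    }
)

# Distribution of a single numeric column.
hist_ax = df[["price"]].plot.hist(bins=20, title="Price distribution")

# Relationship between two numeric columns.
scatter_ax = df.plot.scatter(x="quantity", y="price", title="Price vs quantity")

# Aggregate per category, then plot the result directly.
bar_ax = (
    df.groupby("region")[["price"]]
    .mean()
    .plot.bar(title="Mean price by region", legend=False)
)
```

Each call returns a Matplotlib `Axes` object, so further customization remains available when the defaults are not enough.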
Quadratic is a modern spreadsheet that combines the familiarity of a spreadsheet with the power of code, allowing you to work with data and code collaboratively in real-time. It supports popular programming languages like Python, SQL, and JavaScript, and offers features such as dynamic charts, APIs, multi-line formulas, and AI integration.